_____________________________________________________________________________

Overview of the source QAQC file(s)

_____________________________________________________________________________


CSV data file analyzed in this report:

## File located in folder: Connectivity-Networks/
file
LAGOSUS_NETSv1.0_MedRes_Metrics_Dams.csv


Structure of the QAQC file:

## 'data.frame':    86511 obs. of  22 variables:
##  $ lagoslakeid                  : chr  "8056" "19412" "7956" "68446" ...
##  $ lake_net_upstreamlake_km     : num  NA 2.02 4.37 NA NA ...
##  $ lake_net_downstreamlake_km   : num  NA 11.766 0.385 88.237 NA ...
##  $ lake_net_bidirectionallake_km: num  21.717 2.02 0.385 22.559 19.803 ...
##  $ lake_net_upstreamlake_n      : int  0 1 5 0 0 0 3 0 1 0 ...
##  $ lake_net_downstreamlake_n    : int  0 1 1 1 0 1 1 1 1 0 ...
##  $ lake_net_lakeorder           : int  1 2 4 0 2 0 2 0 1 0 ...
##  $ lake_net_lnn                 : int  1 2 4 1 1 1 5 1 2 1 ...
##  $ net_id                       : int  33 33 33 33 33 33 33 33 33 33 ...
##  $ net_lakes_n                  : int  1731 1731 1731 1731 1731 1731 1731 1731 1731 1731 ...
##  $ net_averagelakedistance_km   : num  132 132 132 132 132 ...
##  $ net_averagelakearea_ha       : num  127 127 127 127 127 ...
##  $ lake_net_nearestdamdown_km   : num  NA NA NA 118 0 ...
##  $ lake_net_nearestdamdown_id   : chr  NA NA NA "18622" ...
##  $ lake_net_totaldamdown_n      : int  0 0 0 1 2 0 0 0 0 0 ...
##  $ lake_net_nearestdamup_km     : num  NA NA NA NA NA NA NA NA NA NA ...
##  $ lake_net_nearestdamup_id     : chr  NA NA NA NA ...
##  $ lake_net_totaldamup_n        : int  0 0 0 0 0 0 0 0 0 0 ...
##  $ lake_net_damonlake_flag      : chr  "N" "N" "N" "N" ...
##  $ lake_net_multidam_flag       : chr  "N" "N" "N" "N" ...
##  $ net_dams_n                   : int  286 286 286 286 286 286 286 286 286 286 ...
##  $ nhdplusv2_comid              : int  4290157 719608 725058 803459 802769 4290235 4290227 4289089 4289087 4289003 ...


_____________________________________________________________________________

Checks on matching QAQC file and GIS layer

_____________________________________________________________________________


GIS layer used for comparison:

## File geodatabase is ../../../LAGOS_US_GIS_Data_v0.7.gdb
## 
## GDB layer is LAGOS_US_All_Lakes_1ha_points


Are there the same number of rows in the QAQC file and GIS layer?

## Number of lagoslakeids in the QAQC file is 86511
## Number of lagoslakeids in the GIS layer is 479950


lagoslakeids in the GIS layer that don’t match the QAQC file:

## 393439 lagoslakeids in the GIS gdb are NOT matched in the QAQC dataset


lagoslakeids in the QAQC file that don’t match the GIS File:

## All lagoslakeids in the QAQC dataset are matched in the GIS gdb
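
The two matching checks above amount to set differences on lagoslakeid. The following is a minimal sketch of that idea, not the code that produced this report; the function name `compare_ids` and the toy ID lists are hypothetical, and IDs are assumed to be held as strings.

```python
def compare_ids(qaqc_ids, gis_ids):
    """Return the IDs found only in the GIS layer and only in the QAQC file."""
    qaqc, gis = set(qaqc_ids), set(gis_ids)
    return {
        "gis_not_in_qaqc": sorted(gis - qaqc),
        "qaqc_not_in_gis": sorted(qaqc - gis),
    }


# Toy IDs for illustration only: the first three appear in the structure
# listing above, the last two are made up.
qaqc_ids = ["8056", "19412", "7956"]
gis_ids = ["8056", "19412", "7956", "100001", "100002"]

result = compare_ids(qaqc_ids, gis_ids)
print(len(result["gis_not_in_qaqc"]), "lagoslakeids in the GIS layer are not matched")
print(len(result["qaqc_not_in_gis"]), "lagoslakeids in the QAQC file are not matched")
```

Because every QAQC lagoslakeid is matched in the GIS layer, the unmatched GIS count reported above is simply the difference of the two totals: 479950 - 86511 = 393439.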


_____________________________________________________________________________

Checks on matching variable names in the QAQC file with those in the metadata file

_____________________________________________________________________________


Metadata file used for comparison:

## /Users/kathe/Dropbox/CL_HUB_DOC/Data_Dictionary/GEO_metric_metadata_WIP.xlsx


Variables in the QAQC file that didn’t match the metadata:

## All QAQC variable names match the metadata


Metadata on QAQC variables that were matched:

variable_name | variable_description | units | data_type | taxonomy_type | n
lagoslakeid | Unique lake identifier developed by LAGOS-US | NULL | int | key | 1
lake_net_bidirectionallake_km | Distance to the nearest lake upstream or downstream using bi-directional graph. | kilometers | numeric | derived | 1
lake_net_damonlake_flag | A value of ‘Y’ indicates that there is at least one dam on this lake. | NULL | factor | information | 1
lake_net_downstreamlake_km | Distance to nearest downstream lake using a unidirectional graph. | kilometers | numeric | derived | 1
lake_net_downstreamlake_n | The number of lakes directly connected through streams downstream of a lake. | number | int | derived | 1
lake_net_lakeorder | Lake order follows the Strahler stream order of the stream that flows from it (outflowing), where the higher order stream is chosen if more than one outlet occurs (Riera et al. 2000, Martin and Soranno 2006). The exception is that headwater lakes are 0. If the lake is a terminal lake, it will receive the order of the highest inflowing stream. | NULL | int | derived | 1
lake_net_lnn | Lake network number (LNN) is the position of a lake within the network in reference to other lakes. The lake at the top of a network (i.e. no upstream lakes) will be 1, the next lake downstream will be 2, etc. If a lake has more than one lake upstream it will take the higher LNN. | NULL | int | derived | 1
lake_net_multidam_flag | A value of ‘Y’ indicates that there are multiple dams on a lake. | NULL | factor | information | 1
lake_net_nearestdamdown_id | The dam ID for the nearest downstream dam. Dam IDs are from the NABD dataset. | NULL | char | information | 1
lake_net_nearestdamdown_km | Distance to nearest downstream dam. | kilometers | numeric | derived | 1
lake_net_nearestdamup_id | The dam ID for the nearest upstream dam. Dam IDs are from the NABD dataset. | NULL | char | information | 1
lake_net_nearestdamup_km | Distance to nearest upstream dam. | kilometers | numeric | derived | 1
lake_net_totaldamdown_n | The total number of dams downstream from a lake. | number | int | derived | 1
lake_net_totaldamup_n | The total number of dams upstream from a lake. | number | int | derived | 1
lake_net_upstreamlake_km | Distance to nearest upstream lake using a unidirectional graph. | kilometers | numeric | derived | 1
lake_net_upstreamlake_n | The number of upstream lakes directly connected through streams to a lake. | number | int | derived | 1
net_averagelakearea_ha | Average lake area in a network. | hectares | numeric | derived | 1
net_averagelakedistance_km | Average distance between lakes in a network. | kilometers | numeric | derived | 1
net_dams_n | The number of total dams in a network. | number | int | derived | 1
net_id | The unique identifier assigned by LAGOS-NETS for each network | NULL | int | derived | 1
net_lakes_n | The total number of lakes in the lake network. | number | int | derived | 1
nhdplusv2_comid | Unique lake identifier from the nhd for the medium resolution NHDplusV2. | NULL | char | key | 1
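
The lake_net_lnn description above can be illustrated on a toy network. The sketch below is one reading of that description, not the LAGOS-NETS implementation: it assumes "take the higher LNN" means one more than the largest LNN among the lakes directly upstream. The function name, the `downstream` mapping, and the lake labels are hypothetical, and the network is assumed to contain no cycles.

```python
def lake_network_numbers(downstream):
    """Assign an LNN to every lake.

    `downstream` maps each lake to the list of lakes directly downstream of it.
    A lake with no upstream lakes gets 1; otherwise it gets one more than the
    largest LNN among its directly upstream lakes (an assumed interpretation).
    """
    # Invert the mapping so that each lake knows its directly upstream lakes.
    upstream = {lake: [] for lake in downstream}
    for lake, targets in downstream.items():
        for target in targets:
            upstream.setdefault(target, []).append(lake)

    lnn = {}

    def visit(lake):
        if lake not in lnn:
            ups = upstream.get(lake, [])
            lnn[lake] = 1 if not ups else 1 + max(visit(u) for u in ups)
        return lnn[lake]

    for lake in upstream:
        visit(lake)
    return lnn


# Toy network: A and B both drain to C, E drains to B, and C drains to D.
toy = {"A": ["C"], "E": ["B"], "B": ["C"], "C": ["D"], "D": []}
lnn = lake_network_numbers(toy)
print(lnn)
```

In this toy network, lake C has two upstream lakes (A with LNN 1 and B with LNN 2), so it follows the higher one and receives LNN 3.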


_____________________________________________________________________________

Summary of lagoslakeid and any other categorical variables in the QAQC file

_____________________________________________________________________________

## Number of unique lagoslakeids = 86511


Data Frame Summary

qaqc_char

Dimensions: 86511 x 4
Duplicates: 53050
No  Variable [type]                          Valid            Missing
1   lake_net_nearestdamdown_id [character]   40101 (46.35%)   46410 (53.65%)
      Value               Freq (% of Valid)
      1. 11559            519 (1.3%)
      2. 38221            376 (0.9%)
      3. 99004            185 (0.5%)
      4. 2032             155 (0.4%)
      5. 49764            145 (0.4%)
      6. 17163            121 (0.3%)
      7. 24255            110 (0.3%)
      8. 24330            105 (0.3%)
      9. 48892            104 (0.3%)
      10. 17069           84 (0.2%)
      [ 30837 others ]    38197 (95.2%)
2   lake_net_nearestdamup_id [character]     7211 (8.34%)     79300 (91.66%)
      Value               Freq (% of Valid)
      1. 2064             6 (0.1%)
      2. 47302            5 (0.1%)
      3. 1641             4 (0.1%)
      4. 17010            4 (0.1%)
      5. 332              4 (0.1%)
      6. 1580             3 (0.0%)
      7. 17067            3 (0.0%)
      8. 17069            3 (0.0%)
      9. 18517            3 (0.0%)
      10. 2122            3 (0.0%)
      [ 7117 others ]     7173 (99.5%)
3   lake_net_damonlake_flag [character]      86511 (100%)     0 (0%)
      Value               Freq (% of Valid)
      1. N                73881 (85.4%)
      2. Y                12630 (14.6%)
4   lake_net_multidam_flag [character]       86511 (100%)     0 (0%)
      Value               Freq (% of Valid)
      1. N                86379 (99.9%)
      2. Y                132 (0.1%)

Generated by summarytools 0.9.6 (R version 4.0.3)
2020-10-27
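
The frequency columns in the summary above report each value's count as a percentage of the valid (non-missing) observations. A minimal sketch of that calculation follows; it is not the summarytools implementation, the function name `freq_table` is hypothetical, and missing values are assumed to be represented as `None`.

```python
from collections import Counter


def freq_table(values):
    """Count each value and express it as a percentage of valid observations."""
    valid = [v for v in values if v is not None]
    counts = Counter(valid)
    rows = [
        (value, n, round(100 * n / len(valid), 1))
        for value, n in counts.most_common()
    ]
    return {"valid": len(valid), "missing": len(values) - len(valid), "rows": rows}


# Toy flag column for illustration only; None stands in for a missing value.
toy_flags = ["N", "N", "N", "Y", None]
table = freq_table(toy_flags)
for value, n, pct in table["rows"]:
    print(f"{value}: {n} ({pct}%)")
print("valid:", table["valid"], "missing:", table["missing"])
```

Applied to the reported counts for lake_net_damonlake_flag, the same arithmetic gives 12630 / 86511, or about 14.6%, for 'Y'.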


_____________________________________________________________________________

Summary of numeric variables in the QAQC file

_____________________________________________________________________________


Number of missing, zero, negative, and positive observations

qvar Missing Zero Negative Positive Total_n Unique
lake_net_bidirectionallake_km 0 137 0 86374 86511 23031
lake_net_downstreamlake_km 10655 73 0 75783 86511 49635
lake_net_downstreamlake_n 0 10655 0 75856 86511 14
lake_net_lakeorder 1 30861 0 55649 86511 11
lake_net_lnn 0 0 0 86511 86511 50
lake_net_nearestdamdown_km 46410 27417 0 12684 86511 12041
lake_net_nearestdamup_km 79300 520 0 6691 86511 4919
lake_net_totaldamdown_n 0 46410 0 40101 86511 33
lake_net_totaldamup_n 0 79300 0 7211 86511 138
lake_net_upstreamlake_km 63003 71 0 23437 86511 7147
lake_net_upstreamlake_n 0 63003 0 23508 86511 194
net_averagelakearea_ha 0 0 0 86511 86511 891
net_averagelakedistance_km 0 0 0 86511 86511 875
net_dams_n 0 1431 0 85080 86511 90
net_lakes_n 0 0 0 86511 86511 128
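
Each row of the table above partitions a variable's observations into four groups. A minimal sketch of that classification is shown below. It is not the report's own code: the function name `count_signs` is hypothetical, missing values are assumed to be `None` or NaN, and `Unique` is assumed to count distinct non-missing values.

```python
def count_signs(values):
    """Tally missing, zero, negative, and positive observations."""
    counts = {"Missing": 0, "Zero": 0, "Negative": 0, "Positive": 0}
    for v in values:
        if v is None or v != v:  # None, or NaN (NaN is never equal to itself)
            counts["Missing"] += 1
        elif v == 0:
            counts["Zero"] += 1
        elif v < 0:
            counts["Negative"] += 1
        else:
            counts["Positive"] += 1
    counts["Total_n"] = len(values)
    # Assumption: distinct values are counted after dropping missing ones.
    counts["Unique"] = len({v for v in values if v is not None and v == v})
    return counts


# Toy distances in km for illustration only; None marks a missing value.
toy_km = [None, 2.02, 4.37, None, 0.0, 2.02]
counts = count_signs(toy_km)
print(counts)
```

The four groups should always sum to Total_n. For lake_net_downstreamlake_km in the table above, 10655 + 73 + 0 + 75783 = 86511.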


Summary statistics of numeric variables

qvar n mean sd median min max
lake_net_bidirectionallake_km 86511 8.398383 12.32958 4.3910 0.000 284.717
lake_net_downstreamlake_km 86511 164.798025 345.34992 26.0075 0.000 2412.315
lake_net_downstreamlake_n 86511 1.436245 1.66076 1.0000 0.000 13.000
lake_net_lakeorder 86511 1.088001 1.14510 1.0000 0.000 9.000
lake_net_lnn 86511 1.644693 1.97461 1.0000 1.000 50.000
lake_net_nearestdamdown_km 86511 37.040721 124.43498 0.0000 0.000 1703.461
lake_net_nearestdamup_km 86511 7.895516 16.50401 2.6630 0.000 269.745
lake_net_totaldamdown_n 86511 1.309198 3.07499 0.0000 0.000 32.000
lake_net_totaldamup_n 86511 0.800187 47.73874 0.0000 0.000 6027.000
lake_net_upstreamlake_km 86511 3.101604 7.85933 0.9870 0.000 220.334
lake_net_upstreamlake_n 86511 1.436245 56.05566 0.0000 0.000 7310.000
net_averagelakearea_ha 86511 79.696370 246.26401 80.4690 1.190 47157.153
net_averagelakedistance_km 86511 756.513007 709.35430 350.5050 0.015 1652.416
net_dams_n 86511 9864.936944 11827.92381 1176.0000 0.000 24986.000
net_lakes_n 86511 13254.641618 15306.75484 2397.0000 2.000 32811.000
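
The table above lists n = 86511 for every variable, including those with missing values, so n appears to be the total row count rather than the number of non-missing observations. The sketch below follows that reading and computes the other statistics on non-missing values only. It is an illustration under those assumptions, not the report's code; the function name `summarize` is hypothetical, and the standard deviation is assumed to be the sample (n - 1) form.

```python
import statistics


def summarize(values):
    """Summary statistics computed on non-missing values; n is the total row count."""
    present = [v for v in values if v is not None]
    return {
        "n": len(values),
        "mean": statistics.mean(present),
        "sd": statistics.stdev(present),  # sample standard deviation (n - 1)
        "median": statistics.median(present),
        "min": min(present),
        "max": max(present),
    }


# Toy values for illustration only; None marks a missing value.
toy_values = [None, 2.0, 4.0, None, 6.0]
stats = summarize(toy_values)
print(stats)
```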


_____________________________________________________________________________

Check on sums of percent composition variables

_____________________________________________________________________________

## Check not relevant to this dataset


_____________________________________________________________________________

Check on zonal data completeness

_____________________________________________________________________________


Summary of zones with datacoveragepct between 0 and 100 (Incomplete) or = 0 (Zero)

## Check not relevant to this dataset


_____________________________________________________________________________

Checks on missing values

_____________________________________________________________________________


Summary and maps of missing character values


## No character variables have missing values


Summary and maps of missing numeric values

## 199369 numeric variable observations have missing values
## 
## NOTE: table not printed if there are > 100 missing values
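
The total above can be cross-checked against the Missing column of the numeric summary table. The sketch below sums the non-zero Missing counts copied from that table; the variable name `missing_by_variable` is chosen here for illustration.

```python
# Non-zero Missing counts copied from the numeric summary table in this report.
missing_by_variable = {
    "lake_net_downstreamlake_km": 10655,
    "lake_net_lakeorder": 1,
    "lake_net_nearestdamdown_km": 46410,
    "lake_net_nearestdamup_km": 79300,
    "lake_net_upstreamlake_km": 63003,
}

total_missing = sum(missing_by_variable.values())
print(total_missing)  # 199369
```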



_____________________________________________________________________________

Spatial patterns of (selected) QAQC character variables

_____________________________________________________________________________


## 100% of the data are plotted for LAGOSUS_NETSv1.0_MedRes_Metrics_Dams
## Points are lake polygon centroids
## Note: When character variables have nearly unique values (e.g., names), maps are not created


_____________________________________________________________________________

Spatial patterns of numeric variables

_____________________________________________________________________________


## 100% of the data are plotted for LAGOSUS_NETSv1.0_MedRes_Metrics_Dams
## Points are lake polygon centroids


_____________________________________________________________________________

Overview of results of QAQC checks

_____________________________________________________________________________